1 A well-defined HDX-MS experiment

This vignette describeds how to analyse time-resolved differential HDX-MS experiments. The key elements are at least two conditions i.e. apo + antibody, apo + small molecule or protein closed + protien open, etc. The experiment can be replicated, though if there are sufficient time points analysed (>=3) then occasionally signficant results can be obtained. The data provided should be centroid-centric data. This package does not yet support analysis straight from raw spectra. Typically this will be provided as a .csv from tools such as dynamiX or HDExaminer.

2 Main elements of the package

The package relies of Bioconductor infrastructure so that it integrates with other data types and can benefit from advantages in other fields of mass-spectrometry. There are package specific object, classes and methods but importantly there is reuse of classes found in quantitative proteomics data, mainly the QFeatures object which extends the SummarisedExperiment class for mass spectrometry data. The focus of this package is on testing and visualisation of the testing results.

3 Data

We will begin with a structural variant experiment in which MHP and a structural variant were mixed in different proportions. HDX-MS was performed on these samples and we expect to see reproducible but subtle differences. We first load the data from the package and it is .csv format.

MBPpath <- system.file("extdata", "MBP.csv", package = "hdxstats")

We can now read in the .csv file and have a quick look at the .csv.

MBP <- read.csv(MBPpath)
head(MBP) # have a look
##   hx_sample pep_start pep_end pep_sequence pep_charge     d confidence  score
## 1       10%        19      30 VIWINGDKGYNG          2 2.120     medium 0.8686
## 2       10%        19      30 VIWINGDKGYNG          2 2.146     medium 0.8173
## 3       10%        19      30 VIWINGDKGYNG          2 2.143     medium 0.8839
## 4       10%        19      30 VIWINGDKGYNG          2    NA       <NA>     NA
## 5       10%        19      30 VIWINGDKGYNG          2    NA       <NA>     NA
## 6       10%        19      30 VIWINGDKGYNG          2    NA       <NA>     NA
##   hx_time time_unit replicate_cnt
## 1      30         s             1
## 2      30         s             2
## 3      30         s             3
## 4      30         s             4
## 5      30         s             5
## 6      30         s             6
length(unique(MBP$pep_sequence)) # peptide sequences
## [1] 115

Let us have a quick visualisation of some the data so that we can see some of the features

filter(MBP, pep_sequence == unique(MBP$pep_sequence[1]), pep_charge == 2) %>%
    ggplot(aes(x = hx_time, y = d, group = factor(replicate_cnt),
               color = factor(hx_sample,
                              unique(MBP$hx_sample)[c(7,5,1,2,3,4,6)]))) + 
    theme_classic() + geom_point(size = 2) + 
    scale_color_manual(values = brewer.pal(n = 7, name = "Set2")) + 
    labs(color = "experiment", x = "Deuterium Exposure", y = "Deuterium incoperation")
## Warning: Removed 96 rows containing missing values (geom_point).

We can see that the units of the time dimension are in seconds and that Deuterium incoperation has been normalized into Daltons.

4 Parsing to an object of class QFeatures

Working from a .csv is likely to cause issues downstream. Indeed, we run the risk of accidently changing the data or corrupting the file in some way. Secondly, all .csvs will be formatted slightly different and so making extensible tools for these files will be inefficient. Furthermore, working with a generic class used in other mass-spectrometry fields can speed up analysis and adoption of new methods. We will work the class QFeatures from the QFeatures class as it is a powerful and scalable way to store quantitative mass-spectrometry data.

Firstly, the data is storted in long format rather than wide format. We first switch the data to wide format.

MBP_wide <- pivot_wider(data.frame(MBP),
                        values_from = d,
                        names_from = c("hx_time", "replicate_cnt", "hx_sample"),
                        id_cols = c("pep_sequence", "pep_charge"))
head(MBP_wide)
## # A tibble: 6 x 198
##   pep_sequence pep_charge `30_1_10%` `30_2_10%` `30_3_10%` `30_4_10%` `30_5_10%`
##   <chr>             <int>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
## 1 VIWINGDKGYNG          2      2.12       2.15       2.14          NA         NA
## 2 VIWINGDKGYN~          2      2.12       2.12       2.11          NA         NA
## 3 GDKGYNGLAEVG          3      0.552      0.555      0.553         NA         NA
## 4 LAEVGKKFEKD~          4      2.41       2.36       2.42          NA         NA
## 5 AEVGKKFEKDT~          4      0.458      0.425      0.573         NA         NA
## 6 TVEHPDKL              3      1.43       1.44       1.44          NA         NA
## # ... with 191 more variables: 30_6_10% <dbl>, 30_7_10% <dbl>, 240_1_10% <dbl>,
## #   240_2_10% <dbl>, 240_3_10% <dbl>, 240_4_10% <dbl>, 240_5_10% <dbl>,
## #   240_6_10% <dbl>, 240_7_10% <dbl>, 1800_1_10% <dbl>, 1800_2_10% <dbl>,
## #   1800_3_10% <dbl>, 1800_4_10% <dbl>, 1800_5_10% <dbl>, 1800_6_10% <dbl>,
## #   1800_7_10% <dbl>, 14400_1_10% <dbl>, 14400_2_10% <dbl>, 14400_3_10% <dbl>,
## #   14400_4_10% <dbl>, 14400_5_10% <dbl>, 14400_6_10% <dbl>, 14400_7_10% <dbl>,
## #   30_1_15% <dbl>, 30_2_15% <dbl>, 30_3_15% <dbl>, 30_4_15% <dbl>, ...

We notice that there are many columns with NAs. The follow code chunk removes these columns.

MBP_wide <- MBP_wide[, colSums(is.na(MBP_wide)) != nrow(MBP_wide)]

We also note that the colnames are not very informative. We are going to format in a very specific way so that later functions can automatically infer the design from the column names. We provide in the format X(time)rep(replicate)cond(condition)

colnames(MBP_wide)[-c(1,2)]
##   [1] "30_1_10%"        "30_2_10%"        "30_3_10%"        "240_1_10%"      
##   [5] "240_2_10%"       "240_3_10%"       "1800_1_10%"      "1800_2_10%"     
##   [9] "1800_3_10%"      "14400_1_10%"     "14400_2_10%"     "14400_3_10%"    
##  [13] "30_1_15%"        "30_2_15%"        "30_3_15%"        "240_1_15%"      
##  [17] "240_2_15%"       "240_3_15%"       "1800_1_15%"      "1800_2_15%"     
##  [21] "1800_3_15%"      "14400_1_15%"     "14400_2_15%"     "14400_3_15%"    
##  [25] "30_1_20%"        "30_2_20%"        "30_3_20%"        "240_1_20%"      
##  [29] "240_2_20%"       "240_3_20%"       "1800_1_20%"      "1800_2_20%"     
##  [33] "1800_3_20%"      "14400_1_20%"     "14400_2_20%"     "14400_3_20%"    
##  [37] "30_1_25%"        "30_2_25%"        "30_3_25%"        "240_1_25%"      
##  [41] "240_2_25%"       "240_3_25%"       "1800_1_25%"      "1800_2_25%"     
##  [45] "1800_3_25%"      "14400_1_25%"     "14400_2_25%"     "14400_3_25%"    
##  [49] "30_1_5%"         "30_2_5%"         "30_3_5%"         "240_1_5%"       
##  [53] "240_2_5%"        "240_3_5%"        "1800_1_5%"       "1800_2_5%"      
##  [57] "1800_3_5%"       "14400_1_5%"      "14400_2_5%"      "14400_3_5%"     
##  [61] "30_1_W169G"      "30_2_W169G"      "30_3_W169G"      "240_1_W169G"    
##  [65] "240_2_W169G"     "240_3_W169G"     "1800_1_W169G"    "1800_2_W169G"   
##  [69] "1800_3_W169G"    "14400_1_W169G"   "14400_2_W169G"   "14400_3_W169G"  
##  [73] "30_1_WT Null"    "30_2_WT Null"    "30_3_WT Null"    "30_4_WT Null"   
##  [77] "30_5_WT Null"    "30_6_WT Null"    "30_7_WT Null"    "240_1_WT Null"  
##  [81] "240_2_WT Null"   "240_3_WT Null"   "240_4_WT Null"   "240_5_WT Null"  
##  [85] "240_6_WT Null"   "240_7_WT Null"   "1800_1_WT Null"  "1800_2_WT Null" 
##  [89] "1800_3_WT Null"  "1800_4_WT Null"  "1800_5_WT Null"  "1800_6_WT Null" 
##  [93] "1800_7_WT Null"  "14400_1_WT Null" "14400_2_WT Null" "14400_3_WT Null"
##  [97] "14400_4_WT Null" "14400_5_WT Null" "14400_6_WT Null" "14400_7_WT Null"
new.colnames <- gsub("0_", "0rep", paste0("X", colnames(MBP_wide)[-c(1,2)]))
new.colnames <- gsub("_", "cond", new.colnames)

# remove annoying % signs
new.colnames <- gsub("%", "", new.colnames)

# remove space (NULL could get confusing later and WT is clear)
new.colnames <- gsub(" .*", "", new.colnames)

new.colnames
##   [1] "X30rep1cond10"       "X30rep2cond10"       "X30rep3cond10"      
##   [4] "X240rep1cond10"      "X240rep2cond10"      "X240rep3cond10"     
##   [7] "X1800rep1cond10"     "X1800rep2cond10"     "X1800rep3cond10"    
##  [10] "X14400rep1cond10"    "X14400rep2cond10"    "X14400rep3cond10"   
##  [13] "X30rep1cond15"       "X30rep2cond15"       "X30rep3cond15"      
##  [16] "X240rep1cond15"      "X240rep2cond15"      "X240rep3cond15"     
##  [19] "X1800rep1cond15"     "X1800rep2cond15"     "X1800rep3cond15"    
##  [22] "X14400rep1cond15"    "X14400rep2cond15"    "X14400rep3cond15"   
##  [25] "X30rep1cond20"       "X30rep2cond20"       "X30rep3cond20"      
##  [28] "X240rep1cond20"      "X240rep2cond20"      "X240rep3cond20"     
##  [31] "X1800rep1cond20"     "X1800rep2cond20"     "X1800rep3cond20"    
##  [34] "X14400rep1cond20"    "X14400rep2cond20"    "X14400rep3cond20"   
##  [37] "X30rep1cond25"       "X30rep2cond25"       "X30rep3cond25"      
##  [40] "X240rep1cond25"      "X240rep2cond25"      "X240rep3cond25"     
##  [43] "X1800rep1cond25"     "X1800rep2cond25"     "X1800rep3cond25"    
##  [46] "X14400rep1cond25"    "X14400rep2cond25"    "X14400rep3cond25"   
##  [49] "X30rep1cond5"        "X30rep2cond5"        "X30rep3cond5"       
##  [52] "X240rep1cond5"       "X240rep2cond5"       "X240rep3cond5"      
##  [55] "X1800rep1cond5"      "X1800rep2cond5"      "X1800rep3cond5"     
##  [58] "X14400rep1cond5"     "X14400rep2cond5"     "X14400rep3cond5"    
##  [61] "X30rep1condW169G"    "X30rep2condW169G"    "X30rep3condW169G"   
##  [64] "X240rep1condW169G"   "X240rep2condW169G"   "X240rep3condW169G"  
##  [67] "X1800rep1condW169G"  "X1800rep2condW169G"  "X1800rep3condW169G" 
##  [70] "X14400rep1condW169G" "X14400rep2condW169G" "X14400rep3condW169G"
##  [73] "X30rep1condWT"       "X30rep2condWT"       "X30rep3condWT"      
##  [76] "X30rep4condWT"       "X30rep5condWT"       "X30rep6condWT"      
##  [79] "X30rep7condWT"       "X240rep1condWT"      "X240rep2condWT"     
##  [82] "X240rep3condWT"      "X240rep4condWT"      "X240rep5condWT"     
##  [85] "X240rep6condWT"      "X240rep7condWT"      "X1800rep1condWT"    
##  [88] "X1800rep2condWT"     "X1800rep3condWT"     "X1800rep4condWT"    
##  [91] "X1800rep5condWT"     "X1800rep6condWT"     "X1800rep7condWT"    
##  [94] "X14400rep1condWT"    "X14400rep2condWT"    "X14400rep3condWT"   
##  [97] "X14400rep4condWT"    "X14400rep5condWT"    "X14400rep6condWT"   
## [100] "X14400rep7condWT"

We will now parse the data into an object of class QFeatures, we have provided a function to assist with this in the package. If you want to do this yourself use the readQFeatures function from the QFeatures package.

MBPqDF <- parseDeutData(object = DataFrame(MBP_wide),
                        design = new.colnames,
                        quantcol = 3:102)

5 Heatmap visualisations of HDX data

To help us get used to the QFeatures we show how to generate a heatmap of these data from this object:

pheatmap(t(assay(MBPqDF)),
         cluster_rows = FALSE, 
         cluster_cols = FALSE,
         color = brewer.pal(n = 9, name = "BuPu"),
         main = "Stuctural variant deuterium incoperation heatmap", 
         fontsize = 14,
         legend_breaks = c(0, 2, 4, 6, 8, 10, 12, max(assay(MBPqDF))),
         legend_labels = c("0", "2", "4", "6", "8","10", "12", "Incorporation"))

If you prefer to have the start-to-end residue numbers in the heatmap instead you can change the plot as follows:

regions <- unique(MBP[,c("pep_start", "pep_end")])
xannot <- paste0("[", regions[,1], ",", regions[,2], "]")
pheatmap(t(assay(MBPqDF)),
         cluster_rows = FALSE, 
         cluster_cols = FALSE,
         color = brewer.pal(n = 9, name = "BuPu"),
         main = "Stuctural variant deuterium incoperation heatmap", 
         fontsize = 14,
         legend_breaks = c(0, 2, 4, 6, 8, 10, 12, max(assay(MBPqDF))),
         legend_labels = c("0", "2", "4", "6", "8","10", "12", "Incorporation"),
         labels_col = xannot)

It maybe useful to normalize HDX-MS data for either interpretation or visualization purposes. We can normalize by the number of exchangeable amides or by using back-exchange correction values. We first use percentage incorporation as normalisation and visualise as a heatmap.

MBPqDF_norm1 <- normalisehdx(MBPqDF,
                             sequence = unique(MBP$pep_sequence),
                             method = "pc")


pheatmap(t(assay(MBPqDF_norm1)),
         cluster_rows = FALSE, 
         cluster_cols = FALSE,
         color = brewer.pal(n = 9, name = "BuPu"),
         main = "Stuctural variant deuterium incoperation heatmap normalised", 
         fontsize = 14,
         legend_breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1, 1.2),
         legend_labels = c("0", "0.2", "0.4", "0.6", "0.8","1", "Incorporation"),
         labels_col = xannot)

Now, we demonstrate a back-exchange correction calculation. The back-exchange value are fictious by the code chunk below demonstrates how to set this up.

# made-up correction factor
correction <- (exchangeableAmides(unique(MBP$pep_sequence)) + 1) * 0.9

MBPqDF_norm2 <- normalisehdx(MBPqDF,
                             sequence = unique(MBP$pep_sequence),
                             method = "bc", 
                             correction = correction)


pheatmap(t(assay(MBPqDF_norm2)),
         cluster_rows = FALSE, 
         cluster_cols = FALSE,
         color = brewer.pal(n = 9, name = "BuPu"),
         main = "Stuctural variant deuterium incoperation heatmap normalised", 
         fontsize = 14,
         legend_breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1, 1.2),
         legend_labels = c("0", "0.2", "0.4", "0.6", "0.8","1", "Incorporation"),
         labels_col = xannot)

6 Functional data analysis of HDX-MS data

The hdxstats package uses an empirical Bayes functional approach to analyse the data. We explain this idea in steps so that we can get an idea of the approach. First we fit the parametric model to the data. This will allow us to explore the HdxStatModel class.

res <- differentialUptakeKinetics(object = MBPqDF[,1:100], #provide a QFeature object
                                  feature = rownames(MBPqDF)[[1]][37], # which peptide to do we fit
                                  start = list(a = NULL, b = NULL,  d = NULL, p = 1)) # what are the starting parameter guesses
## Warning in differentialUptakeKinetics(object = MBPqDF[, 1:100], feature =
## rownames(MBPqDF)[[1]][37], : NAs introduced by coercion

Here, we see the HdxStatModel class, and that a Functional Model was applied to the data and a total of 7 models were fitted.

res
## Object of class "HdxStatModel"
## Method: Functional Model 
## Fitted 7

The nullmodel and alternative slots of an instance of HdxStatModel provide the underlying fitted models. The method and formula slots provide vital information about what analysis was performed. The vis slot provides a ggplot object so that we can visualise the functional fits.

res@vis

Since this is a ggplot object, we can customise in the usual grammatical ways.

res@vis + scale_color_manual(values = brewer.pal(n = 8, name = "Set2"))
## Scale for 'colour' is already present. Adding another scale for 'colour',
## which will replace the existing scale.

A number of standard methods are available and can be applied to a HdxStatModels, these extend the usual base stats methods. These include

  1. anova: An analysis of variance
  2. logLik: The log-likelihood of all the fitted models
  3. residuals: The residuals for the fitted models
  4. vcov: The variance-covariance matrix between parameters of the models
  5. likRatio: The likelihood ratio between null and alternative models
  6. wilk: Applies wilk’s theorem to obtain a p-value from the liklihood ratio
  7. coef: The fitted model coefficients
  8. deviance: The deviance of the fitted models
  9. summary: The statistical summary of the models.
anova(res)
## Analysis of Variance Table
## 
## Model 1: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## Model 2: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## Model 3: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## Model 4: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## Model 5: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## Model 6: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## Model 7: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## Model 8: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
##   Res.Df Res.Sum Sq  Df  Sum Sq F value    Pr(>F)    
## 1     96    14.0842                                  
## 2      8     0.0788  88 14.0054 16.1547 0.0001443 ***
## 3      8     0.0969   0  0.0000                      
## 4      8     0.0622   0  0.0000                      
## 5      8     0.0439   0  0.0000                      
## 6      8     0.0891   0  0.0000                      
## 7      8     0.1381   0  0.0000                      
## 8     24     0.1978 -16 -0.0597  0.2162 0.9955202    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
logLik(res)
##       null       alt1       alt2       alt3       alt4       alt5       alt6 
## -43.887971  13.126180  11.888973  14.550056  16.637855  12.387702   9.762426 
##       alt7 
##  29.610175
residuals(res)
## $null
##   [1] -0.104877437 -0.200877437 -0.116877437 -0.041072067 -0.067072067
##   [6] -0.005072067 -0.261216380 -0.227216380 -0.299216380 -0.048625022
##  [11] -0.133625022 -0.110625022 -0.066877437 -0.093877437 -0.179877437
##  [16] -0.007072067  0.025927933  0.025927933 -0.271216380 -0.155216380
##  [21] -0.190216380  0.032374978 -0.015625022  0.059374978 -0.011877437
##  [26] -0.101877437 -0.010877437  0.036927933  0.022927933  0.054927933
##  [31] -0.167216380 -0.130216380 -0.184216380  0.002374978  0.065374978
##  [36] -0.001625022 -0.020877437 -0.071877437 -0.135877437  0.006927933
##  [41]  0.038927933  0.061927933 -0.081216380 -0.097216380 -0.080216380
##  [46] -0.065625022 -0.014625022  0.034374978 -0.109877437 -0.097877437
##  [51] -0.100877437 -0.032072067 -0.064072067 -0.020072067 -0.219216380
##  [56] -0.269216380 -0.371216380 -0.086625022  0.004374978 -0.067625022
##  [61]  0.394122563  0.451122563  0.430122563  1.206927933  1.253927933
##  [66]  1.193927933  1.312783620  1.230783620  1.314783620  0.676374978
##  [71]  0.673374978  0.659374978 -0.164877437 -0.120877437 -0.115877437
##  [76] -0.119877437 -0.106877437 -0.231877437 -0.175877437 -0.114072067
##  [81] -0.196072067 -0.144072067 -0.137072067 -0.154072067 -0.119072067
##  [86] -0.192072067 -0.467216380 -0.448216380 -0.378216380 -0.401216380
##  [91] -0.476216380 -0.416216380 -0.419216380 -0.216625022 -0.197625022
##  [96] -0.128625022 -0.108625022 -0.137625022 -0.192625022 -0.097625022
## attr(,"label")
## [1] "Residuals"
## 
## $alt1
##  [1] -0.009898550 -0.105898550 -0.021898550  0.106888320  0.080888320
##  [6]  0.142888320 -0.081788938 -0.047788938 -0.119788938  0.070970219
## [11] -0.014029781  0.008970219
## attr(,"label")
## [1] "Residuals"
## 
## $alt2
##  [1] -0.004342915 -0.031342915 -0.117342915  0.099863140  0.132863140
##  [6]  0.132863140 -0.157167451 -0.041167451 -0.076167451  0.031013013
## [11] -0.016986987  0.058013013
## attr(,"label")
## [1] "Residuals"
## 
## $alt3
##  [1] -1.156681e-02 -1.015668e-01 -1.056681e-02  9.740434e-02  8.340434e-02
##  [6]  1.154043e-01 -8.042403e-02 -4.342403e-02 -9.742403e-02  8.914765e-05
## [11]  6.308915e-02 -3.910852e-03
## attr(,"label")
## [1] "Residuals"
## 
## $alt4
##  [1]  0.01876067 -0.03223933 -0.09623933  0.04060710  0.07260710  0.09560710
##  [7] -0.05576977 -0.07176977 -0.05476977 -0.03283071  0.01816929  0.06716929
## attr(,"label")
## [1] "Residuals"
## 
## $alt5
##  [1] -0.05659450 -0.04459450 -0.04759450  0.12309108  0.09109108  0.13509108
##  [7] -0.01809493 -0.06809493 -0.17009493 -0.01448807  0.07651193  0.00451193
## attr(,"label")
## [1] "Residuals"
## 
## $alt6
##  [1] -0.11280756 -0.05580756 -0.07680756  0.14555180  0.19255180  0.13255180
##  [7] -0.07978353 -0.16178353 -0.07778353  0.03073824  0.02773824  0.01373824
## attr(,"label")
## [1] "Residuals"
## 
## $alt7
##  [1] -0.066801305 -0.022801305 -0.017801305 -0.021801305 -0.008801305
##  [6] -0.133801305 -0.077801305  0.150955697  0.068955697  0.120955697
## [11]  0.127955697  0.110955697  0.145955697  0.072955697 -0.119152785
## [16] -0.100152785 -0.030152785 -0.053152785 -0.128152785 -0.068152785
## [21] -0.071152785 -0.042024039 -0.023024039  0.045975961  0.065975961
## [26]  0.036975961 -0.018024039  0.076975961
## attr(,"label")
## [1] "Residuals"
vcov(res)
## $nullvcov
##             a            d            p            b
## a 63692.07745 -626.8478954 -62.21434972 -46.23661252
## d  -626.84790    6.4162870   0.62378780   0.45034693
## p   -62.21435    0.6237878   0.06132884   0.04493879
## b   -46.23661    0.4503469   0.04493879   0.03365640
## 
## $altvcov
## $altvcov[[1]]
##              a             d             p             b
## a 6742701.1442 -4768.9958242 -537.34752248 -547.80456073
## d   -4768.9958     3.4943890    0.38629016    0.38674911
## p    -537.3475     0.3862902    0.04315682    0.04361914
## b    -547.8046     0.3867491    0.04361914    0.04451004
## 
## $altvcov[[2]]
##              a             d             p             b
## a 7316298.6577 -5227.1188884 -601.94684015 -584.03713247
## d   -5227.1189     3.8738541    0.43734650    0.41646482
## p    -601.9468     0.4373465    0.04992303    0.04800845
## b    -584.0371     0.4164648    0.04800845    0.04662654
## 
## $altvcov[[3]]
##              a             d             p             b
## a 2578184.1801 -2415.9000724 -285.99824273 -265.71548132
## d   -2415.9001     2.3506192    0.27265968    0.24841571
## p    -285.9982     0.2726597    0.03198786    0.02944404
## b    -265.7155     0.2484157    0.02944404    0.02738933
## 
## $altvcov[[4]]
##             a            d            p            b
## a 5878.577327 -98.50322758 -9.912579881 -5.771303900
## d  -98.503228   1.72275462  0.169505588  0.094988717
## p   -9.912580   0.16950559  0.016883531  0.009648163
## b   -5.771304   0.09498872  0.009648163  0.005707650
## 
## $altvcov[[5]]
##              a             d             p             b
## a 7681647.2937 -4246.4813267 -579.65783162 -473.96913262
## d   -4246.4813     2.4469027    0.32649334    0.26149742
## p    -579.6578     0.3264933    0.04412874    0.03573315
## b    -473.9691     0.2614974    0.03573315    0.02924736
## 
## $altvcov[[6]]
##            a          d            p            b
## a  3.0294725 -2.1842142 -0.144502473  0.137107621
## d -2.1842142  1.6073618  0.103919656 -0.100323599
## p -0.1445025  0.1039197  0.006960795 -0.006582306
## b  0.1371076 -0.1003236 -0.006582306  0.006320456
## 
## $altvcov[[7]]
##              a             d             p             b
## a 2661850.5109 -1.102308e+03 -1.859364e+02 -1.197810e+02
## d   -1102.3084  4.792908e-01  7.867061e-02  4.949860e-02
## p    -185.9364  7.867061e-02  1.311836e-02  8.358982e-03
## b    -119.7810  4.949860e-02  8.358982e-03  5.390538e-03
likRatio(res)
##    logLR 
## 303.7027
wilk(res)
##      p-value 
## 3.010729e-50
coef(res)
##               a          d         p           b
## null  33.440391 0.00000000 0.2088560 0.031873044
## alt1 105.743745 0.06708915 0.2079068 0.009119329
## alt2 105.115429 0.12759940 0.2117926 0.008921556
## alt3  90.532172 0.23941485 0.2140331 0.010025198
## alt4  26.480372 0.00000000 0.2153563 0.038936374
## alt5 108.656492 0.39104991 0.2265998 0.007103946
## alt6   8.328166 0.00000000 0.3086935 0.131389110
## alt7 112.225337 0.62105356 0.2473896 0.005336843
deviance(res)
##        null        alt1        alt2        alt3        alt4        alt5 
## 14.08418512  0.07881381  0.09686220  0.06216397  0.04389538  0.08913641 
##        alt6        alt7 
##  0.13806352  0.19776317
summary(res)
## $null
## 
## Formula: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## 
## Parameters:
##    Estimate Std. Error t value Pr(>|t|)
## a  33.44039  252.37289   0.133    0.895
## d   0.00000    2.53304   0.000    1.000
## p   0.20886    0.24765   0.843    0.401
## b   0.03187    0.18346   0.174    0.862
## 
## Residual standard error: 0.383 on 96 degrees of freedom
## 
## Number of iterations to convergence: 45 
## Achieved convergence tolerance: 1.49e-08
## 
## 
## $alt1
## 
## Formula: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## 
## Parameters:
##    Estimate Std. Error t value Pr(>|t|)
## a 1.057e+02  2.597e+03   0.041    0.969
## d 6.709e-02  1.869e+00   0.036    0.972
## p 2.079e-01  2.077e-01   1.001    0.346
## b 9.119e-03  2.110e-01   0.043    0.967
## 
## Residual standard error: 0.09926 on 8 degrees of freedom
## 
## Number of iterations till stop: 96 
## Achieved convergence tolerance: 1.49e-08
## Reason stopped: Number of calls to `fcn' has reached or exceeded `maxfev' == 500.
## 
## 
## $alt2
## 
## Formula: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## 
## Parameters:
##    Estimate Std. Error t value Pr(>|t|)
## a 1.051e+02  2.705e+03   0.039    0.970
## d 1.276e-01  1.968e+00   0.065    0.950
## p 2.118e-01  2.234e-01   0.948    0.371
## b 8.922e-03  2.159e-01   0.041    0.968
## 
## Residual standard error: 0.11 on 8 degrees of freedom
## 
## Number of iterations till stop: 95 
## Achieved convergence tolerance: 1.49e-08
## Reason stopped: Number of calls to `fcn' has reached or exceeded `maxfev' == 500.
## 
## 
## $alt3
## 
## Formula: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## 
## Parameters:
##    Estimate Std. Error t value Pr(>|t|)
## a 9.053e+01  1.606e+03   0.056    0.956
## d 2.394e-01  1.533e+00   0.156    0.880
## p 2.140e-01  1.789e-01   1.197    0.266
## b 1.003e-02  1.655e-01   0.061    0.953
## 
## Residual standard error: 0.08815 on 8 degrees of freedom
## 
## Number of iterations till stop: 95 
## Achieved convergence tolerance: 1.49e-08
## Reason stopped: Number of calls to `fcn' has reached or exceeded `maxfev' == 500.
## 
## 
## $alt4
## 
## Formula: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## 
## Parameters:
##   Estimate Std. Error t value Pr(>|t|)
## a 26.48037   76.67188   0.345    0.739
## d  0.00000    1.31254   0.000    1.000
## p  0.21536    0.12994   1.657    0.136
## b  0.03894    0.07555   0.515    0.620
## 
## Residual standard error: 0.07407 on 8 degrees of freedom
## 
## Number of iterations to convergence: 45 
## Achieved convergence tolerance: 1.49e-08
## 
## 
## $alt5
## 
## Formula: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## 
## Parameters:
##    Estimate Std. Error t value Pr(>|t|)
## a 1.087e+02  2.772e+03   0.039    0.970
## d 3.911e-01  1.564e+00   0.250    0.809
## p 2.266e-01  2.101e-01   1.079    0.312
## b 7.104e-03  1.710e-01   0.042    0.968
## 
## Residual standard error: 0.1056 on 8 degrees of freedom
## 
## Number of iterations till stop: 95 
## Achieved convergence tolerance: 1.49e-08
## Reason stopped: Number of calls to `fcn' has reached or exceeded `maxfev' == 500.
## 
## 
## $alt6
## 
## Formula: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## 
## Parameters:
##   Estimate Std. Error t value Pr(>|t|)   
## a  8.32817    1.74054   4.785  0.00138 **
## d  0.00000    1.26782   0.000  1.00000   
## p  0.30869    0.08343   3.700  0.00604 **
## b  0.13139    0.07950   1.653  0.13700   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1314 on 8 degrees of freedom
## 
## Number of iterations to convergence: 32 
## Achieved convergence tolerance: 1.49e-08
## 
## 
## $alt7
## 
## Formula: value ~ a * (1 - exp(-b * (timepoint)^p)) + d
## 
## Parameters:
##    Estimate Std. Error t value Pr(>|t|)  
## a 1.122e+02  1.632e+03   0.069    0.946  
## d 6.211e-01  6.923e-01   0.897    0.379  
## p 2.474e-01  1.145e-01   2.160    0.041 *
## b 5.337e-03  7.342e-02   0.073    0.943  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.09078 on 24 degrees of freedom
## 
## Number of iterations till stop: 95 
## Achieved convergence tolerance: 1.49e-08
## Reason stopped: Number of calls to `fcn' has reached or exceeded `maxfev' == 500.

7 Analysis of a typical HDX-MS experiment

We have seen the basic aspects of our functional modelling approach. We now wish to roll out our method across all peptides in the experiment. The fitUptakeKinetics function allows us to apply our modelling approach across all the peptide in the experiment. We need to provide a QFeatures object and the features for which we are fitting the model. The design will be extracted from the column names or you can provide a design yourself. The parameter initilisation should also be provided. Sometimes the model can’t be fit on the kinetics. This is either because there is not enough data or through lack of convergence. An error will be reported in these cases but this should not perturb the user. You may wish to try a few starting values if there excessive models that fail fitting.

res <- fitUptakeKinetics(object = MBPqDF[,c(1:24)],
                         feature = rownames(MBPqDF[,c(1:24)])[[1]],
                         start = list(a = NULL, b = 0.001,  d = NULL, p = 1))
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## [1] "model fit failed, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## [1] "model fit failed, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## [1] "model fit failed, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## [1] "model fit failed, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"

The code chunk above returns a class HdxStatModels indicating that a number of models for peptide have been fit. This is simply a holder for a list of HdxStatModel instances.

res
## Object of class "HdxStatModels"
## Number of models 101

We can easily examine indivual fits by going to the underyling HdxStatModel class:

res@statmodels[[1]]@vis + scale_color_manual(values = brewer.pal(n = 2, name = "Set2"))
## Warning in brewer.pal(n = 2, name = "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
## Scale for 'colour' is already present. Adding another scale for 'colour',
## which will replace the existing scale.

We now wish to apply statistical analysis to these fitted curves. Our approach is an empirical Bayes testing procedure, which borrows information across peptides to stablise variance estimates. Here, we need to provide the original data that was analysed and the HdxStatModels class. The following code chunk returns an object of class HdxStatRes. This object tell us that statistical analysis was performed using our Functional model.

out <- processFunctional(object = MBPqDF[,1:24], params = res)
out
## Object of class "HdxStatRes"
## Analysed using Functional model

The main slot of interest is the results slot which returns quantities of interest such as p-values and fdr corrected p-values because of multiple testing. The following is the DataFrame of interest.

out@results
## DataFrame with 101 rows and 8 columns
##                           Fstat.Fstat Fstat.numerator Fstat.denomenator
##                                <list>          <list>            <list>
## VIWINGDKGYNG_2               0.709752      0.00109177        0.00153824
## VIWINGDKGYNGL_2             0.0329087     6.17994e-05         0.0018779
## GDKGYNGLAEVG_3               0.980103      0.00129713        0.00132346
## LAEVGKKFEKDTGIKVTVEHPDK_4    0.809885      0.00117025        0.00144496
## AEVGKKFEKDTGIKV_4            0.806449      0.00140588        0.00174329
## ...                               ...             ...               ...
## WYAVRTAVINAASGRQTVDE_3       -3.98233        -3.15572          0.792429
## YAVRTAVINAASGRQTVDE_3        -3.96761        -2.79196          0.703687
## AVINA_1                      -3.87726       -0.213395         0.0550377
## AVINAASGRQTVDEAL_2           0.905874      0.00277575        0.00306417
## ASGRQTVDEAL_2                0.364877      0.00215211        0.00589818
##                               pvals       fdr ebayes.pvals ebayes.fdr
##                           <numeric> <numeric>    <numeric>  <numeric>
## VIWINGDKGYNG_2             0.597063         1     0.588708          1
## VIWINGDKGYNGL_2            0.997693         1     0.997546          1
## GDKGYNGLAEVG_3             0.445918         1     0.442337          1
## LAEVGKKFEKDTGIKVTVEHPDK_4  0.536935         1     0.530267          1
## AEVGKKFEKDTGIKV_4          0.538920         1     0.526221          1
## ...                             ...       ...          ...        ...
## WYAVRTAVINAASGRQTVDE_3     1.000000         1     1.000000          1
## YAVRTAVINAASGRQTVDE_3      1.000000         1     1.000000          1
## AVINA_1                    1.000000         1     1.000000          1
## AVINAASGRQTVDEAL_2         0.483850         1     0.457615          1
## ASGRQTVDEAL_2              0.830033         1     0.811168          1
##                           fitcomplete
##                             <integer>
## VIWINGDKGYNG_2                      1
## VIWINGDKGYNGL_2                     2
## GDKGYNGLAEVG_3                      3
## LAEVGKKFEKDTGIKVTVEHPDK_4           4
## AEVGKKFEKDTGIKV_4                   5
## ...                               ...
## WYAVRTAVINAASGRQTVDE_3             97
## YAVRTAVINAASGRQTVDE_3              98
## AVINA_1                            99
## AVINAASGRQTVDEAL_2                100
## ASGRQTVDEAL_2                     101

We can now examine the peptides for which the false discovery rate is less than 0.05

which(out@results$ebayes.fdr < 0.05)
## EAAFNKGETAM_2   AAFNKGETA_2  AAFNKGETAM_2 WYAVRTAVINA_2 
##            55            56            57            96

Let us visualise some of these examples:

res@statmodels[[42]]@vis + res@statmodels[[45]]@vis

As we can see our model has picked up some subtle differences, we can further visualise these using a forest plot. We can see the the functions are very similar as the parameters are almost identical (a,b,p,d). However, we can see that the deuterium differences are lower in 10% structural variant condition.

fp <- forestPlot(params = res@statmodels[[42]])

We can produce a table to actual numbers. We see that at all 4 timepoints the deuterium difference is negative, though the confidence intervals overlap with 0. Our functional approach is picking up this small but reproducible difference.

knitr::kable(fp$data)
Estimate confL confU rownames condition
a 11.1549013 -57.1059774 79.4157799 a 1
d 0.0000000 -67.8287075 67.8287075 d 1
p 0.1331966 -0.2441938 0.5105870 p 1
b 1.2582482 -4.0600159 6.5765122 b 1
a1 11.1549013 -57.1059774 79.4157799 a 2
d1 0.0000000 -67.8287075 67.8287075 d 2
p1 0.1331966 -0.2441938 0.5105870 p 2
b1 1.2582482 -4.0600159 6.5765122 b 2
1 -0.1336667 -0.2728705 0.0055372 Timepoint 30 Deuterium Difference
2 -0.0350000 -0.2080751 0.1380751 Timepoint 240 Deuterium Difference
3 -0.0196667 -0.0873897 0.0480564 Timepoint 1800 Deuterium Difference
4 0.0343333 -0.1120953 0.1807620 Timepoint 14400 Deuterium Difference

It is also possible to visualize, these plots on a different scale. Of course, changing the natural scaling will emphasis different parts of the plot and could distort interpretation. In particular, if a log transform is used then care should be taken when interpreting values around 0. We suggest examining the numerical values in a forest plot or table alongside any transformation of the variables. We suggest using the pseudo log transform as this allows control the linearity of the plot, clearly demonstrating this a choice of visualisation (and not of statistical modelling). The parameter sigma below controls the scaling factor of the linear part of the transformation.

res@statmodels[[42]]@vis + scale_x_continuous(
    trans = pseudo_log_trans(base = 10, sigma = 0.01), breaks = c(0, 10^(1:7)))

res@statmodels[[42]]@vis + scale_x_continuous(
    trans = pseudo_log_trans(base = 10, sigma = 0.0001), breaks = c(0, 10^(1:7)))

res@statmodels[[42]]@vis + scale_x_continuous(
    trans = pseudo_log_trans(base = 10, sigma = 10), breaks = c(0, 10^(1:7)))

Let’s us now have a look a situation where the changes are more dramatic.

res_wt <- fitUptakeKinetics(object = MBPqDF[, c(61:100)],
                            feature = rownames(MBPqDF[, c(61:100)])[[1]],
                            start = list(a = NULL, b = 0.001,  d = NULL, p = 1))
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## [1] "model fit failed, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## [1] "model fit failed, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## [1] "model fit failed, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## [1] "model fit failed, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion

## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## Error in nlsModel(formula, mf, start, wts) : 
##   singular gradient matrix at initial parameter estimates
## [1] "Could not fit model, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## [1] "model fit failed, likely exessive missing values"
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## [1] "model fit failed, likely exessive missing values"
out_wt <- processFunctional(object = MBPqDF[, c(61:100)], params = res_wt)

We can visualise some of the result and generate plots.

res_wt@statmodels[[27]]@vis/res_wt@statmodels[[28]]@vis + plot_layout(guides = "collect")|(forestPlot(params = res_wt@statmodels[[27]], condition = c("WT", "W169G"))/forestPlot(params = res_wt@statmodels[[28]], condition = c("WT", "W169G")) + plot_layout(guides = "collect")) + 
    plot_annotation(tag_levels = 'a') +  plot_layout(widths = c(1, 1))

8 An epitope mapping experiment

We now describe the analysis of an epitope mapping experiment. Here, the data analysis is more challenging, since only 1 replicate in each condition, apo and antibody, was performed. If we make some simplifying assumptions rigorous statistical analysis can still be performed.

The experiment was performed on HOIP-RBR, we loaded the data below from inside the package

HOIPpath <- system.file("extdata", "N64184_1a2_state.csv", package = "hdxstats")
HOIP <- read.csv(HOIPpath)
unique(HOIP$State)
##  [1] "apo"     "dAb13_1" "dAb13_2" "dAb25_1" "dAb25_2" "dAb27_1" "dAb27_2"
##  [8] "dAb2_1"  "dAb2_2"  "dAb6_1"  "dAb6_2"
HOIP$Exposure <- HOIP$Exposure * 60 #convert to seconds
filter(HOIP, Sequence == unique(HOIP$Sequence[1])) %>%
    ggplot(aes(x = Exposure,
               y = Center,
               color = factor(State, unique(HOIP$State)))) +
    theme_classic() + geom_point(size = 3) + 
    scale_color_manual(values = colorRampPalette(brewer.pal(8, name = "Set2"))(11)) + 
    labs(color = "experiment", x = "Deuterium Exposure", y = "Deuterium incoperation")

As before we need to convert data to an object of classes QFeatures for ease of analysis.

First, we put the data into a DataFrame object. Currently, its in long format so we switch to a wide format

HOIP_wide <- pivot_wider(data.frame(HOIP),
                         values_from = Center,
                         names_from = c("Exposure", "State"),
                         id_cols = c("Sequence"))

Now remove all columns with only NAs

HOIP_wide <- HOIP_wide[, colSums(is.na(HOIP_wide)) != nrow(HOIP_wide)]

The colanmes are not very informative, provide in the format X(time)rep(repliate)cond(condition)

colnames(HOIP_wide)[-c(1)]
##  [1] "0_apo"       "30_apo"      "300_apo"     "0_dAb13_1"   "30_dAb13_1" 
##  [6] "300_dAb13_1" "0_dAb13_2"   "30_dAb13_2"  "300_dAb13_2" "0_dAb25_1"  
## [11] "30_dAb25_1"  "300_dAb25_1" "0_dAb25_2"   "30_dAb25_2"  "300_dAb25_2"
## [16] "0_dAb27_1"   "30_dAb27_1"  "300_dAb27_1" "0_dAb27_2"   "30_dAb27_2" 
## [21] "300_dAb27_2" "0_dAb2_1"    "30_dAb2_1"   "300_dAb2_1"  "0_dAb2_2"   
## [26] "30_dAb2_2"   "300_dAb2_2"  "0_dAb6_1"    "30_dAb6_1"   "300_dAb6_1" 
## [31] "0_dAb6_2"    "30_dAb6_2"   "300_dAb6_2"
new.colnames <- gsub("0_", "0rep1", paste0("X", colnames(HOIP_wide)[-c(1)]))
new.colnames <- gsub("rep1", "rep1cond", new.colnames)

# remove annoying % signs
new.colnames <- gsub("%", "", new.colnames)

# remove space (NULL could get confusing later and WT is clear)
new.colnames <- gsub(" .*", "", new.colnames)

Now, we can provide rownames and convert the data to a QFeatures object:

qDF <- parseDeutData(object = DataFrame(HOIP_wide),
                     design = new.colnames,
                     quantcol = 2:34,
                     rownames = HOIP_wide$Sequence)

As before, we can produce a heatmap, we perform a simple normalisation for ease of visualisation:

mat <- assay(qDF)
mat <- apply(mat, 2, function(x) x - assay(qDF)[,1])

pheatmap(t(mat),
         cluster_rows = FALSE,
         cluster_cols = FALSE,
         color = brewer.pal(n = 9, name = "BuPu"),
         main = "HOIP RBR heatmap",
         fontsize = 14,
         legend_breaks = c(0, 2, 4, 6,8,10,12, max(assay(qDF))),
         legend_labels = c("0", "2", "4", "6", "8","10", "12", "Incorporation"))

Let us first perform a quick test:

res <- differentialUptakeKinetics(object = qDF[,1:33],
                                  feature = rownames(qDF)[[1]][3],
                                  start = list(a = NULL, b = 0.01,  d = NULL),
                                  formula = value ~ a * (1 - exp(-b*(timepoint))) + d) 
## Warning in differentialUptakeKinetics(object = qDF[, 1:33], feature =
## rownames(qDF)[[1]][3], : NAs introduced by coercion
res@vis+ scale_color_manual(values = colorRampPalette(brewer.pal(8, name = "Set2"))(11)) 
## Scale for 'colour' is already present. Adding another scale for 'colour',
## which will replace the existing scale.

Whilst this analysis performs good fits for the functions, there are too many degrees of freedom to perform sound statistical analysis. Hence, we normalize to remove the degree of freedom for the intercept. For simplicity and to preserve the original matrix, we reprocess the data. We then fit a simplified kinetic model, where only the plateau is inferred.

cn <- new.colnames[c(1:3,10:12)]
HOIP_wide_nrm <- data.frame(HOIP_wide)
HOIP_wide_nrm[, c(2:4)] <- HOIP_wide_nrm[,c(2:4)] - HOIP_wide_nrm[,c(2)] # normalise by intercept
HOIP_wide_nrm[, c(11:13)] <- HOIP_wide_nrm[,c(11:13)] - HOIP_wide_nrm[,c(11)] # normalised by intercept

newqDF <- parseDeutData(object = DataFrame(HOIP_wide_nrm),
                        design = cn,
                        quantcol = c(2:4, 11:13), rownames = HOIP_wide$Sequence)

res_all <- fitUptakeKinetics(object = newqDF[,1:6],
                             feature = rownames(newqDF[,1:6])[[1]],
                             start = list(a = NULL),
                             formula = value ~ a * (1 - exp(-0.05*(timepoint))), maxAttempts = 1)
  
funresdAb25_1 <- processFunctional(object = newqDF[,1:6],
                                   params = res_all)

We can have a look at the results:

funresdAb25_1@results
## DataFrame with 110 rows and 8 columns
##                           Fstat.Fstat Fstat.numerator Fstat.denomenator
##                                <list>          <list>            <list>
## GPGQECA                      0.505943       0.0236133         0.0466718
## CAVCGWALPHNRM                0.819962       0.0111598         0.0136102
## CAVCGWALPHNRMQAL              3.18841        0.130744         0.0410059
## CAVCGWALPHNRMQALTSCE          1.51203        0.598316          0.395703
## AVCGWALPHNRM                 0.138914      0.00401256         0.0288852
## ...                               ...             ...               ...
## ATERYLHVRPQPLAGEDPPAYQARL    0.882916       0.0634907         0.0719102
## ERYLHVRPQPLAGEDPPAYQ         0.841277       0.0671491         0.0798181
## ERYLHVRPQPLAGEDPPAYQARL       2.11026        0.172771          0.081872
## RYLHVRPQPLAGEDPPAYQARL        1.65969        0.120275         0.0724686
## LQKLTEEVPLGQSIPRRRK           4.48223        0.196504         0.0438406
##                               pvals       fdr ebayes.pvals ebayes.fdr
##                           <numeric> <numeric>    <numeric>  <numeric>
## GPGQECA                    0.516181  0.652643     0.479007   0.612683
## CAVCGWALPHNRM              0.416403  0.558589     0.406959   0.545920
## CAVCGWALPHNRMQAL           0.148709  0.320744     0.123017   0.300707
## CAVCGWALPHNRMQALTSCE       0.286210  0.499732     0.237829   0.441045
## AVCGWALPHNRM               0.728272  0.801099     0.708845   0.779730
## ...                             ...       ...          ...        ...
## ATERYLHVRPQPLAGEDPPAYQARL  0.400605  0.557804    0.3563859   0.502596
## ERYLHVRPQPLAGEDPPAYQ       0.410930  0.558053    0.3658748   0.509446
## ERYLHVRPQPLAGEDPPAYQARL    0.219967  0.443454    0.1817334   0.386882
## RYLHVRPQPLAGEDPPAYQARL     0.267115  0.481684    0.2263725   0.436859
## LQKLTEEVPLGQSIPRRRK        0.101671  0.266280    0.0814249   0.246599
##                           fitcomplete
##                             <integer>
## GPGQECA                             1
## CAVCGWALPHNRM                       2
## CAVCGWALPHNRMQAL                    3
## CAVCGWALPHNRMQALTSCE                4
## AVCGWALPHNRM                        5
## ...                               ...
## ATERYLHVRPQPLAGEDPPAYQARL         106
## ERYLHVRPQPLAGEDPPAYQ              107
## ERYLHVRPQPLAGEDPPAYQARL           108
## RYLHVRPQPLAGEDPPAYQARL            109
## LQKLTEEVPLGQSIPRRRK               110
which(funresdAb25_1@results$ebayes.fdr < 0.05)
##           IQLRESLEPDA RESLEPDAYALFHKKLTEGVL         YALFHKKLTEGVL 
##                    36                    42                    43 
##       REQLEATCPQCHQTF           EATCPQCHQTF       MYLQENGIDCPKCKF 
##                    52                    53                    65 
##     YLQENGIDCPKCKFSYA      LQENGIDCPKCKFSYA 
##                    68                    70

We can plot these kinetics to see what is happening. This allows us to visualise region of protection and deprotection, potentially identifiying the epitope.

(res_all@statmodels[[36]]@vis + 
res_all@statmodels[[42]]@vis  + 
res_all@statmodels[[43]]@vis  + 
res_all@statmodels[[65]]@vis  + 
res_all@statmodels[[68]]@vis  + 
res_all@statmodels[[70]]@vis  + 
res_all@statmodels[[52]]@vis  + 
res_all@statmodels[[53]]@vis ) + plot_layout(guides = 'collect')

We can make a Manhattan plot to better specially visualise what’s happening.

#We need to provide an indication of "difference" so we can examine deprotected
# or prected regions
diffdata <- assay(newqDF)[,6] - assay(newqDF)[,3]


sigplots <- manhattanplot(params = funresdAb25_1,
                          sequences = HOIP$Sequence, 
                          region = HOIP[, c("Start", "End")],
                          difference = diffdata,
                          nrow = 1)
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
sigplots[[1]] + plot_layout(guides = 'collect')

We can visualise this in a peptide plot which helps us understand the nature of the overlap

fpath <- system.file("extdata", "HOIP.txt", package = "hdxstats", mustWork = TRUE)
HOIPfasta <- readAAStringSet(filepath = fpath, "fasta")

scores <- funresdAb25_1@results$ebayes.fdr

out <- plotEpitopeMap(AAString = HOIPfasta[[1]],
                      peptideSeqs = unique(HOIP$Sequence),
                      numlines = 2,
                      maxmismatch = 1,
                      by = 1,
                      scores = 1 * (-log10(scores[unique(HOIP$Sequence)])  > -log10(0.05)) + 0.0001,
                      name = "significant")
## Warning in brewer.pal(n = 2, name = "Set2"): minimal value for n is 3, returning requested palette with 3 different levels
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
out[[1]]/(out[[2]]) + plot_layout(guides = 'collect') & theme(legend.position = "right")

We can further visualise this a barcode of particular residues, here we use residue level averaging to obtain results at the residue level.

scores <- funresdAb25_1@results$ebayes.fdr
out2 <- plotEpitopeMapResidue(AAString = HOIPfasta[[1]],
                              peptideSeqs = unique(HOIP$Sequence),
                              numlines = 2,
                              maxmismatch = 1,
                              by = 5,
                              scores = scores[unique(HOIP$Sequence)],
                              name = "-log10 p value")
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
out2[[1]]/out2[[2]]  + plot_layout(guides = 'collect') & theme(legend.position = "right")

We can also plot multiple residue maps on the same plot so that we can compare different antibodies.

scores <- funresdAb25_1@results$ebayes.fdr
avMap25_1 <- ComputeAverageMap(AAString = HOIPfasta[[1]],
                               peptideSeqs = unique(HOIP$Sequence),
                               numlines = 2, maxmismatch = 1,
                               by = 10, scores = scores[unique(HOIP$Sequence)],
                               name = "-log10 p value")

## generate results from other dAB
cn <- new.colnames[c(1:3,19:21)]
HOIP_wide_nrm <- data.frame(HOIP_wide)
HOIP_wide_nrm[,c(2:4)] <- HOIP_wide_nrm[,c(2:4)] - HOIP_wide_nrm[,c(2)]
HOIP_wide_nrm[,c(20:22)] <- HOIP_wide_nrm[,c(20:22)] - HOIP_wide_nrm[,c(20)] 

newqDF2 <- parseDeutData(object = DataFrame(HOIP_wide_nrm),
                        design = cn,
                        quantcol = c(2:4,20:22),
                        rownames = HOIP_wide$Sequence)

res_all2 <- fitUptakeKinetics(object = newqDF2[,1:6],
                             feature = rownames(newqDF2[,1:6])[[1]],
                             start = list(a = NULL),
                             formula = value ~ a * (1 - exp(-0.07*(timepoint))), maxAttempts = 1)
## Warning in max(data$value): no non-missing arguments to max; returning -Inf
## Warning in differentialUptakeKinetics(object = object, feature = x, start =
## start, : NAs introduced by coercion
## [1] "model fit failed, likely exessive missing values"
funresdAb27_2 <- processFunctional(object = newqDF[,1:6],
                                   params = res_all2)

scores <- funresdAb27_2@results$ebayes.fdr
# compute average map
avMap27_2  <- ComputeAverageMap(AAString = HOIPfasta[[1]],
                                peptideSeqs = unique(HOIP$Sequence),
                                numlines = 2,
                                maxmismatch = 1,
                                by = 10,
                                scores = scores[unique(HOIP$Sequence)],
                                name = "-log10 p value")

# set rownames
rownames(avMap25_1) <- "dAb25_1"
rownames(avMap27_2) <- "dAb27_2"

# store in a list
avMap <- list(avMap27_2 = avMap27_2,
              avMap25_1 = avMap25_1)

#plotting
out3 <- plotAverageMaps(avMap, by = 20)
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
out3[[1]]/out3[[2]]  + plot_layout(guides = 'collect') & theme(legend.position = "right")

9 Protection/deprotection maps

Users may also wish to colour according to protection or deprotections of hdx. The following functions determine whether there is protection or not by computing deuterium difference. The users pass the argument for which to compute the differences. The user pass can any scores though it is useful to look at the FDR provided by previous analysis. As is typical blue indicates protection whilst red deprotection. The binding epitope is very clear from these diagrams.

scores <- funresdAb27_2@results$ebayes.fdr
hdxdiff1 <- hdxdifference(object = newqDF2,
              AAString = HOIPfasta[[1]],
              peptideSeqs = unique(HOIP$Sequence),
              numlines = 2,
              maxmismatch = 1,
              by = 10,
              scores = scores[unique(HOIP$Sequence)],
              cols = c(2,5), # columns of object to use for differences
              name = "-log10 p value (signed)")
## Warning in hdxdifference(object = newqDF2, AAString = HOIPfasta[[1]],
## peptideSeqs = unique(HOIP$Sequence), : NaNs produced
scores <- funresdAb25_1@results$ebayes.fdr
hdxdiff2 <- hdxdifference(object = newqDF,
              AAString = HOIPfasta[[1]],
              peptideSeqs = unique(HOIP$Sequence),
              numlines = 2,
              maxmismatch = 1,
              by = 10,
              scores = scores[unique(HOIP$Sequence)],
              cols = c(2,5),
              name = "-log10 p value (signed)")
## Warning in hdxdifference(object = newqDF, AAString = HOIPfasta[[1]], peptideSeqs
## = unique(HOIP$Sequence), : NaNs produced
dMap <- list(hdxdiff1$diffMap, hdxdiff2$diffMap)
out4 <- hdxheatmap(averageMaps = avMap, diffMaps = dMap)
## Scale for 'y' is already present. Adding another scale for 'y', which will
## replace the existing scale.
out4[[1]]/out4[[2]]  + plot_layout(guides = 'collect') & theme(legend.position = "right")

10 Multi-level testing

Advanced users and those struggling to find any significant results may wish to employ a more complex method. Here, we introduce a multi-level testing procedure that allows us to combine p-values in the spatial axis of the data. Individual p-value may not be significant but once grouped we might obtain additional power. This approach also controls a different form of multiplicity correction called the strong-sense family wise error rate (ssFWER). The FWER is the probability of of making at least one type 1 error (false discovery) and is hence a more stringent notion than FDR. We control this quantity in the strong sense, which means we do not require the global null hypothesis to be true. Typically, we wish to control this quantity at some level \(\alpha = 0.05\), which means the probability of there being at least one false discovery is less than \(0.05\). Our strategy combines \(p\)-values in the spatial axis using the harmonic mean p-value approach. The cost for this is that the regions which we cover become wider, potentially impacting interpretation. Furthermore, becuase of combining p-values it no longer makes sense to examine protection/deprotection factors. However, it is valid to then examine only the region with significant results without affecting multiplicity. In the following code chunk, we combined p-values spatially and control for multiplicity using an interval of size 20. This means a sliding window of \(20\) peptide is used to combine p-values. The region that this covers is plotted in the x-axis. For example, from the plot below we can deduce that there is at least 1 peptide in region [49,120] where there is a significant difference, where the probability that this is a false discovery is \(0.05\).

hmpplot <- hmpWindow(params = funresdAb25_1,
                     sequences = HOIP$Sequence,
                     region = HOIP[, c("Start", "End")],
                     interval = 20)
## Scale for 'x' is already present. Adding another scale for 'x', which will
## replace the existing scale.
hmpplot
## [[1]]

11 Exporting results

We suggest saving files in .rds format. However, if you wish to export data to .csv format this can be done as in the following exampe.

# not run
write.csv(funresdAb25_1@results, file = "myhdxresuts.csv")